CERT Research Annual Report

Author

  • Carnegie Mellon
Abstract

Botnet size is often reported as a number of IP addresses, but the link between IP addresses and infected machines is more complicated than a simple one-to-one relationship. To count the number of infected machines when we have only an aggregated view of a botnet, we suggest building a precise probability model of the observable behavior of a single machine, and applying that model to the aggregates to obtain a population estimate. As an example, we build a probability model of the peer-to-peer (P2P) scanning activity of the Conficker-C botnet, and we use this model to estimate the number of active infected machines per hour over a two-month window.

Problem Addressed

When new botnets emerge, the classic question is, “How big is it?” In practice, population estimates are often made public in online articles or security blogs, where it is difficult to determine the methods used to obtain either the estimate or the margin of error. Population estimation has a long history in the statistical literature, but methods often involve complex models that track each individual unit in the population over time. These methods are difficult to implement over millions of individuals on an Internet-wide scale, and they also assume an observer has a direct view of the units of interest: individual machines.

Often the view of infected machines in a botnet is filtered through IP space. The existence of Network Address Translation (NAT), proxies, and Dynamic Host Configuration Protocol (DHCP) leases in IPv4 space complicates the link between IP addresses and machines. For example, if 100 infected machines are scanning a network from behind a NAT gateway, the outside observer sees only the aggregate scan attempts from all 100 hosts, coming from the single gateway IP address. In this research, we consider a principled but computationally approachable model for measuring the number of active hosts per hour in a botnet, based on modeling activity for a single host.
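
The gap between IP addresses and machines described above can be made concrete with a toy sketch. The addresses, host names, and the `Scan` record are hypothetical and chosen only for illustration; a real observer would never see the `machine_id` field, which is kept here solely to show how far an IP count can drift from the true machine count.

```python
from collections import namedtuple

# One observed scan attempt. machine_id is hidden from a real observer;
# it is included here only to compare against the IP-based count.
Scan = namedtuple("Scan", ["hour", "src_ip", "machine_id"])


def build_toy_observations():
    """Return hypothetical scan records for a NAT case and a DHCP case."""
    scans = []
    # 100 infected machines share one NAT gateway address (undercounting).
    for m in range(100):
        scans.append(Scan(hour=0, src_ip="198.51.100.1",
                          machine_id=f"nat-host-{m}"))
    # One infected machine receives three DHCP leases over three hours
    # (overcounting).
    leases = ["203.0.113.10", "203.0.113.57", "203.0.113.99"]
    for hour, ip in enumerate(leases):
        scans.append(Scan(hour=hour, src_ip=ip, machine_id="dhcp-host"))
    return scans


def count_unique(scans, field):
    """Count distinct values of one field across all scan records."""
    return len({getattr(s, field) for s in scans})


scans = build_toy_observations()
ip_count = count_unique(scans, "src_ip")
machine_count = count_unique(scans, "machine_id")
print(ip_count, machine_count)  # 4 101
```

In this toy case the observer counts 4 IP addresses for 101 machines, and the error runs in both directions at once: the NAT gateway hides 99 machines while the DHCP host is counted three times.
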
We apply our method to the Conficker-C botnet that emerged in March 2009.

Research Approach

Our method relies on precisely describing a probability distribution for the observable behavior of a single infected host, and using the expected value of the distribution to represent a single host when we observe only aggregated measurements. In this context, “observable behavior” is a quantitative measurement that we can see with a network telescope, for example scanning rates, beaconing frequencies, or a count of remote file downloads. The behavioral model should account for network variability, empirical testing results, and any stochastic elements in either the underlying protocol or measurement method. It can be informed by sandbox experiments on the malware associated with a botnet, or from reverse engineering of the botnet’s source code.

Once developed, the model provides a theoretical average value for the measured behavior across all hosts, as well as a measure of variance. Given an aggregated count, for example a stream of scan attempts coming from a gateway IP address, we divide the observed value by the theoretical average in order to obtain the population estimate. Statistical theory [1] proves that this estimate is unbiased for the true population when the model is precise and the theoretical average is accurate. We can also obtain a margin of error for the population size that is informed by the variance of the single-host probability distribution.

Expected Benefits

As Internet addressing moves away from the single-host, single-IP model, for example with the highly ephemeral nature of IPv6 addresses, behavioral models based on single-host probability distributions will still be able to provide measurements and metrics in terms of counts of machines. This ability allows for more interpretable and concrete results when we discuss the size of a population.
Furthermore, this methodology can be used not only to track the size of homogeneous populations such as botnets, but also to build inventory models for network situational awareness when network monitoring agencies have either limited sensor placement or only a high-level view of a very large network.

2009 Accomplishments

In 2009, we used the single-host probability model to study the Conficker botnet. The Conficker-C worm variant that propagated through hosts infected with Conficker-A and Conficker-B in March 2009 introduced a specific pattern of peer-to-peer (P2P) activity using both TCP and UDP protocols. When a host infected with Conficker-C comes online, it searches for peers by randomly generating a set of destination IP addresses across most of IPv4 space, attempting connections to these hosts, and adding any peers it encounters to an internal list. Connection ports are based on a deterministic algorithm that uses the source IP address and date. This algorithm was the key to a behavioral signature that identifies Conficker-C P2P traffic with high reliability and that can be observed in the large-scale summary information contained in network flow data.

We used the behavioral signature to obtain counts of UDP scan attempts from external Conficker-infected IP addresses into our network of approximately 21,000 class C net blocks (0.15% of IPv4 space), recorded hourly from March 5 through April 24, 2009. Overall, we observed 38 million unique IP addresses during the two-month period, across 1.09 million class C net blocks. Figure 1 displays the distribution of the average scan attempts per hour by class C net block using a log-log plot. While the majority of net blocks scanned our network between 3 and 5 times per hour when active, NATs and gateways show up as a long tail in the distribution, with some blocks scanning our network an average of over 1000 times per hour.
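
The estimate described under Research Approach, dividing an aggregated count by the single-host theoretical average, can be sketched for a net block in the long tail. This is a minimal illustration and not the report's actual model: the function `estimate_hosts`, the per-host mean of 4 scans per hour, and the per-host variance of 4 are all assumptions made for the example, and the margin of error uses a normal approximation for independent, identically behaving hosts.

```python
import math


def estimate_hosts(observed_scans, mean_per_host, var_per_host, z=1.96):
    """Estimate active hosts behind an aggregated hourly scan count.

    Assumes hosts scan independently with a common mean and variance
    (hypothetical inputs; a real model would derive them from the malware).
    Returns (point estimate, approximate margin of error).
    """
    if mean_per_host <= 0:
        raise ValueError("mean_per_host must be positive")
    # Point estimate: observed aggregate divided by the single-host average.
    n_hat = observed_scans / mean_per_host
    # For N independent hosts, Var(total) = N * var_per_host, so
    # Var(n_hat) = N * var_per_host / mean_per_host**2.
    # The unknown N is replaced by n_hat (a plug-in approximation).
    std_err = math.sqrt(n_hat * var_per_host) / mean_per_host
    return n_hat, z * std_err


# Hypothetical gateway seen scanning 1000 times in one hour.
n_hat, margin = estimate_hosts(observed_scans=1000,
                               mean_per_host=4.0,
                               var_per_host=4.0)
print(f"estimated hosts: {n_hat:.0f} +/- {margin:.1f}")
```

Under these assumed inputs, a block scanning 1000 times per hour corresponds to roughly 250 hosts rather than one. The width of the interval depends entirely on the assumed single-host variance, which is why the behavioral model needs to capture the stochastic elements of the protocol.
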

Publication date: 2014